Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape username within DN correctly and remove escape_userdn #267

Merged
merged 1 commit into from
Sep 17, 2024

Conversation

consideRatio
Copy link
Member

@consideRatio consideRatio commented Sep 16, 2024

Background

Why escape_userdn was introduced

escape_userdn was introduced after a non-conclusive discussion in #32 (review), where it made things break for some, and fixed things for others.

I think that was a fluke related to other issues, such as using the userdn variable both in a binding operation and in other operations where it was part of a search filter. When a DN is part of a search filter as a value to find, it should be escaped with escape_filter_chars. With #238, we are good at using escape_filter_chars just-in-time where its needed.

Prevent LDAP injections and the escape_rdn function

The config valid_username_regex was this projects key defence for LDAP injection attacks, but since it was introduced there is now a escape_rdn function to help escaping usernames given to be part of a DN via bind_dn_template strings. This is stressed as important in the ldap3 Python package's docs:

image

What should be escaped?

PR changes

  • escape_userdn is made to no longer have any effect and logs a warning if set
  • escape_rdn is applied on a username string when constructing DN(s) from bind_dn_template.
  • Removed a section that escaped , in a username, which was just a piece of what escape_rdn escapes.
  • Misc docs are updated

EDIT: Breaking change

This PR introduced a breaking change in a edge case scenario if both...

  • the following LDAPAuthenticator configuration was used: lookup_dn = True, lookup_dn_user_dn_attribute = "cn", use_lookup_dn_username = True (previous default value)
  • one or more users, that had already signed in at any point in time before, had a comma in the cn attribute's value, as they would if for example the cn attribute values looks like "lastname, firstname"

Then, these users' resulting JupyterHub usernames would be changed from looking like "lastname\\, firstname" to "lastname, firstname".

Related

@consideRatio consideRatio force-pushed the pr/remove-escape-userdn branch 2 times, most recently from 4f7bd11 to 242ff07 Compare September 16, 2024 21:26
@consideRatio consideRatio changed the title Escape DN attribute values according to LDAP spec and remove escape_userdn Escape DN attribute values according to LDAP spec and remove escape_userdn Sep 16, 2024
@consideRatio consideRatio force-pushed the pr/remove-escape-userdn branch from 242ff07 to 1c254c3 Compare September 16, 2024 21:29
@consideRatio consideRatio changed the title Escape DN attribute values according to LDAP spec and remove escape_userdn Escape username within DN correctly and remove escape_userdn Sep 16, 2024
@consideRatio
Copy link
Member Author

@m-erhardt, @Nikolai-Hlubek, @manics, @minrk if you have capacity to review this a bit, I'd greatly appreciate it - this isn't a PR I feel comfortable merging without further input.

I think this is a crucial fix for the reliability and long term maintenance of this project, but I've not really tested the behavior changes from this change in any other setup than the CI system which doesn't have great coverage of configurations and LDAP servers.

@consideRatio consideRatio force-pushed the pr/remove-escape-userdn branch from 1c254c3 to 0cfd040 Compare September 16, 2024 22:16
@consideRatio consideRatio mentioned this pull request Sep 16, 2024
12 tasks
Copy link
Member

@minrk minrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my reading of everything, this looks sensible to me, but I've never used LDAP. Thanks for doing all this work!

conn = self.get_connection(
userdn=search_dn,
userdn=self.lookup_dn_search_user,
Copy link
Member Author

@consideRatio consideRatio Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the DN here isn't escaped with escape_rdn, because its a full DN rather than a part of it. The full DN include commas and equalsigns for example, and they shouldnt be escaped as they are describing structure of the DN itself, but an individual attributes value should be escaped.

We cant do that here though as we have a full DN here and not its parts.

@consideRatio consideRatio force-pushed the pr/remove-escape-userdn branch from 0cfd040 to fee33e2 Compare September 17, 2024 13:38
@consideRatio
Copy link
Member Author

(Rebased on main)

@consideRatio consideRatio merged commit fde899f into jupyterhub:main Sep 17, 2024
5 checks passed
@consideRatio
Copy link
Member Author

Going for a merge and aiming for this to be available to test practically in a beta release

@@ -463,17 +462,12 @@ async def authenticate(self, handler, data):
username, resolved_dn = self.resolve_username(username)
if not username:
return None
if str(self.lookup_dn_user_dn_attribute).upper() == "CN":
# Only escape commas if the lookup attribute is CN
username = re.subn(r"([^\\]),", r"\1\,", username)[0]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The later escape_rdn is a superset of this comma escape. So probably ok.

However the if ... logic is now missing. Is it important to only do this escape at some times?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if logic didn't make sense to me, the username should reasonably always be escaped before being passed to the LDAP server.

Hmmm, i guess this could also have impacted the username chosen if use_lookup_dn_username is true as well though...

Are you a user of that?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for my use case the if condition would always be true as my company uses Active Directory.

Copy link
Member Author

@consideRatio consideRatio Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, the issue I'm seeing now, is that username is assigned with the escaping, and the username variable will, conditionally on use_lookup_dn_username end up being used to become the JupyterHub username, which otherwise can become whatever the user initially wrote in the login prompt.

This is because of this line of code, where resolved_username seen below was previously named username. In other words, the use_lookup_dn_username can be used or not and it doesn't really matter for anything except the resulting JupyterHub username. One doesn't want to change the username for one person over time typically though, as storage can be associated with the JupterHub username.

username = resolved_username if self.use_lookup_dn_username else login_username
auth_state = {
"ldap_groups": ldap_groups,
"user_attributes": user_attributes,
}
return {"name": username, "auth_state": auth_state}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR description with this:

EDIT: Breaking change

This PR introduced a breaking change in a edge case scenario if both...

  • the following LDAPAuthenticator configuration was used: lookup_dn = True, lookup_dn_user_dn_attribute = "cn", use_lookup_dn_username = True (previous default value)
  • one or more users, that had already signed in at any point in time before, had a comma in the cn attribute's value, as they would if for example the cn attribute values looks like "lastname, firstname"

Then, these users' resulting JupyterHub usernames would be changed from looking like "lastname\\, firstname" to "lastname, firstname".

userdn = dn.format(username=username)
if self.escape_userdn:
userdn = escape_filter_chars(userdn)
userdn = dn.format(username=escape_rdn(username))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

escape_filter_chars and escape_rdn seem to escape for different things unless I'm missing something.

escape_filter_chars(r"\foo + , (N") -> '\\5cfoo + , \\28N'
escape_rdn(r"\foo + , (N") -> '\\\\foo \\+ \\, (N'

Is it ok to exchange them here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes i think that is right to do, because the rdn escape is for parts of a dn like here. Filter chars wasn't right to use here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment to this in another PR merged after this, linking to an RFC

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you are correct. You explained this at the beginning of the PR but first I didn't understand this.

@Nikolai-Hlubek
Copy link

@consideRatio Sorry for the late reply.
I don't know about LDAP so I can only check the logic of the changes. I noticed two things which I wrote comments about, in the commit.

@m-erhardt
Copy link
Contributor

I know, I'm a little late to the party.
Unfortunatley I've been quite busy for the past weeks and had no capacity to dig into this.

As more testing/validation certainly wont hurt before the release of 2.0.0 I've just validated this change by testing the current version from the main-branch (0c65788) on our production system against our ActiveDirectory with the following (unchanged) config.

c.JupyterHub.authenticator_class = 'ldapauthenticator.LDAPAuthenticator'

c.LDAPAuthenticator.server_address = 'our.companys.ldap.server'
c.LDAPAuthenticator.server_port = 636
c.LDAPAuthenticator.use_ssl = True

c.LDAPAuthenticator.lookup_dn = True
c.LDAPAuthenticator.lookup_dn_search_filter = '({login_attr}={login})'
c.LDAPAuthenticator.lookup_dn_search_user = 'CN=jupyter.bind,OU=Users,DC=our,DC=company'
c.LDAPAuthenticator.lookup_dn_search_password = '***'
c.LDAPAuthenticator.user_search_base = 'DC=our,DC=company'
c.LDAPAuthenticator.user_attribute = 'sAMAccountName'
c.LDAPAuthenticator.lookup_dn_user_dn_attribute = 'cn'
c.LDAPAuthenticator.escape_userdn = False
c.LDAPAuthenticator.use_lookup_dn_username = False

c.LDAPAuthenticator.allowed_groups = [
    "CN=JupyterHubGroup,OU=Groups,DC=our,DC=company",
]

I've also validated that the issue described in #237 is still fixed (which is the case).

So thanks @consideRatio for decluttering the whole escape_userdn/escape_rdn issue and improving the documentation.

So for most users using ldapauthenticator with ActiveDirectory this should not be a breaking change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants